Order-aware Interactive Segmentation
Wang, Bin, Choudhuri, Anwesa, Zheng, Meng, Gao, Zhongpai, Planche, Benjamin, Deng, Andong, Liu, Qin, Chen, Terrence, Bagci, Ulas, Wu, Ziyan
Interactive segmentation aims to accurately segment target objects with minimal user interaction. However, current methods often fail to accurately separate target objects from the background, due to a limited understanding of order, i.e., the relative depth between objects in a scene. To address this issue, we propose OIS: order-aware interactive segmentation, where we explicitly encode the relative depth between objects into order maps. We introduce a novel order-aware attention, where the order maps seamlessly guide the user interactions (in the form of clicks) to attend to the image features. We further present an object-aware attention module to incorporate a strong object-level understanding to better differentiate objects with similar order. Our approach allows both dense and sparse integration of user clicks, enhancing both accuracy and efficiency compared to prior works. Experimental results demonstrate that OIS achieves state-of-the-art performance, improving mIoU after one click by 7.61 on the HQSeg44K dataset and 1.32 on the DAVIS dataset compared to the previous state-of-the-art SegNext, while also doubling inference speed compared to current leading methods. The project page is https://ukaukaaaa.github.io/projects/OIS/index.html
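The core idea of order-aware attention, as the abstract describes it, is to let a click attend preferentially to image features at a similar relative depth. A minimal sketch of that mechanism, with illustrative names and a Gaussian order bias that is our assumption rather than the paper's actual formulation:

```python
import numpy as np

def order_aware_attention(click_feat, image_feats, click_order, order_map, sigma=0.1):
    """Toy sketch of depth-biased attention: a click query attends to
    image features, with pixels whose order (relative depth) value is
    close to the clicked pixel's weighted up. All names and the
    Gaussian bias are illustrative, not the paper's implementation."""
    # standard scaled dot-product scores between the click and each pixel
    scores = image_feats @ click_feat / np.sqrt(click_feat.size)
    # order bias: penalize pixels whose relative depth differs from the click's
    order_bias = -((order_map - click_order) ** 2) / (2 * sigma ** 2)
    logits = scores + order_bias
    w = np.exp(logits - logits.max())
    return w / w.sum()  # attention weights over pixels
```

With identical appearance features, a pixel at the clicked depth receives more attention than one at a different depth, which is the separation behavior the order maps are meant to provide.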
KineDepth: Utilizing Robot Kinematics for Online Metric Depth Estimation
Atar, Soofiyan, Zhi, Yuheng, Richter, Florian, Yip, Michael
Depth perception is essential for a robot's spatial and geometric understanding of its environment, with many tasks traditionally relying on hardware-based depth sensors like RGB-D or stereo cameras. However, these sensors face practical limitations, including issues with transparent and reflective objects, high costs, calibration complexity, spatial and energy constraints, and increased failure rates in compound systems. While monocular depth estimation methods offer a cost-effective and simpler alternative, their adoption in robotics is limited due to their output of relative rather than metric depth, which is crucial for robotics applications. In this paper, we propose a method that utilizes a single calibrated camera, enabling the robot to act as a "measuring stick" to convert relative depth estimates into metric depth in real-time as tasks are performed. Our approach employs an LSTM-based metric depth regressor, trained online and refined through probabilistic filtering, to accurately restore the metric depth across the monocular depth map, particularly in areas proximal to the robot's motion. Experiments with real robots demonstrate that our method significantly outperforms current state-of-the-art monocular metric depth estimation techniques, achieving a 22.1% reduction in depth error and a 52% increase in success rate for a downstream task.
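The "measuring stick" idea boils down to anchoring a relative depth map at pixels where the robot's kinematics supply known metric depth. The paper uses an online LSTM regressor with probabilistic filtering; the sketch below substitutes a plain affine least-squares fit (metric ≈ a·relative + b) purely to illustrate the anchoring step, with hypothetical names throughout:

```python
import numpy as np

def fit_scale_shift(rel_depth, metric_anchors):
    """Illustrative linear stand-in for the paper's LSTM regressor:
    recover metric depth from relative depth using a few anchor pixels
    (e.g., the robot end-effector's position, known from kinematics).
    metric_anchors maps pixel (row, col) -> ground-truth metric depth."""
    r = np.array([rel_depth[uv] for uv in metric_anchors])
    m = np.array([metric_anchors[uv] for uv in metric_anchors])
    # solve m ≈ a * r + b in the least-squares sense
    A = np.stack([r, np.ones_like(r)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, m, rcond=None)
    return a * rel_depth + b  # metric depth over the whole map
```

An affine correction is exact only when the relative depth map is itself an affine-invariant estimate; the paper's learned regressor and filtering handle the spatially varying errors a global fit cannot.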
Pre-processing and Compression: Understanding Hidden Representation Refinement Across Imaging Domains via Intrinsic Dimension
Konz, Nicholas, Mazurowski, Maciej A.
In recent years, there has been interest in how geometric properties such as intrinsic dimension (ID) of a neural network's hidden representations change through its layers, and how such properties are predictive of important model behavior such as generalization ability. However, evidence has begun to emerge that such behavior can change significantly depending on the domain of the network's training data, such as natural versus medical images. Here, we further this inquiry by exploring how the ID of a network's learned representations changes through its layers, in essence, characterizing how the network successively refines the information content of input data to be used for predictions. Analyzing eleven natural and medical image datasets across six network architectures, we find that how ID changes through the network differs noticeably between natural and medical image models. Specifically, medical image models peak in representation ID earlier in the network, implying a difference in the image features and their abstractness that are typically used for downstream tasks in these domains. Additionally, we discover a strong correlation of this peak representation ID with the ID of the data in its input space, implying that the intrinsic information content of a model's learned representations is guided by that of the data it was trained on. Overall, our findings emphasize notable discrepancies in network behavior between natural and non-natural imaging domains regarding hidden representation information content, and provide further insights into how a network's learned features are shaped by its training data.
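The quantity being tracked through the layers is the intrinsic dimension of a point cloud of hidden representations. A standard estimator for this is TwoNN (Facco et al.), which uses the ratio of each point's two nearest-neighbor distances; this is a generic ID estimator offered as an assumption about the kind of tool involved, not the authors' exact pipeline:

```python
import numpy as np

def two_nn_id(X):
    """TwoNN intrinsic-dimension estimate for points X (n_samples, n_features).
    For data on a d-dimensional manifold, the ratio mu = r2/r1 of each
    point's two nearest-neighbor distances follows a Pareto law with
    exponent d; the MLE of that exponent estimates the ID."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)   # exclude self-distances
    D.sort(axis=1)
    mu = D[:, 1] / D[:, 0]        # second-NN / first-NN distance
    mu = mu[np.isfinite(mu) & (mu > 1)]
    return len(mu) / np.log(mu).sum()  # Pareto-exponent MLE
```

Applied layer by layer to a network's activations, this yields the ID-versus-depth curves whose peak position the abstract reports as differing between natural and medical image models.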
Single-Image Depth Perception in the Wild
This paper studies single-image depth perception in the wild, i.e., recovering depth from a single image taken in unconstrained settings. We introduce a new dataset "Depth in the Wild" consisting of images in the wild annotated with relative depth between pairs of random points. We also propose a new algorithm that learns to estimate metric depth using annotations of relative depth. Compared to the state of the art, our algorithm is simpler and performs better. Experiments show that our algorithm, combined with existing RGB-D data and our new relative depth annotations, significantly improves single-image depth perception in the wild.
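Learning metric-ordered depth from pairwise relative annotations is typically done with a ranking loss over annotated point pairs. A minimal sketch in that spirit, where the sign convention and the squared penalty for equal-depth pairs are illustrative assumptions rather than a quotation of the paper's exact loss:

```python
import numpy as np

def relative_depth_loss(z_i, z_j, r):
    """Pairwise supervision from a relative-depth annotation r:
    r = +1 if point i is annotated as farther than point j,
    r = -1 if closer, r = 0 if the two are at equal depth.
    Ordered pairs get a logistic ranking loss; equal pairs a
    squared penalty. Conventions here are illustrative."""
    diff = z_i - z_j
    if r == 0:
        return diff ** 2                 # push equal-depth pairs together
    return np.log1p(np.exp(-r * diff))   # push ordered pairs apart
```

Summing this loss over the annotated pairs of "Depth in the Wild" gives a training signal for a dense depth predictor without any metric ground truth at those points.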
EE-Tuning: An Economical yet Scalable Solution for Tuning Early-Exit Large Language Models
Pan, Xuchen, Chen, Yanxi, Li, Yaliang, Ding, Bolin, Zhou, Jingren
This work introduces EE-Tuning, a lightweight and economical solution to training/tuning early-exit large language models (LLMs). In contrast to the common approach of full-parameter pre-training, EE-Tuning augments any pre-trained (and possibly fine-tuned) standard LLM with additional early-exit layers that are tuned in a parameter-efficient manner, which requires significantly less computational resources and training data. Our implementation of EE-Tuning achieves outstanding training efficiency via extensive performance optimizations, as well as scalability due to its full compatibility with 3D parallelism. Results of systematic experiments validate the efficacy of EE-Tuning, confirming that effective early-exit LLM inference can be achieved with a limited training budget. In hope of making early-exit LLMs accessible to the community, we release the source code of our implementation of EE-Tuning at https://github.com/pan-x-c/EE-LLM.
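The parameter-efficient recipe the abstract describes amounts to bolting a small trainable exit head onto an intermediate layer of a frozen pre-trained model, and exiting when that head is confident. A toy sketch of that structure, with shapes and the confidence rule chosen for illustration rather than taken from the released EE-LLM code:

```python
import numpy as np

rng = np.random.default_rng(0)

class EarlyExitHead:
    """Sketch of the EE-Tuning idea: only this head's parameters are
    trained, while the backbone LLM stays frozen. Given a hidden state
    h from an intermediate layer, the head produces token logits and a
    simple max-probability exit decision. Illustrative API only."""
    def __init__(self, hidden, vocab):
        self.W = rng.normal(0.0, 0.02, (hidden, vocab))  # trainable weights

    def logits(self, h):
        return h @ self.W

    def should_exit(self, h, threshold=0.9):
        l = self.logits(h)
        p = np.exp(l - l.max())
        p /= p.sum()
        return p.max() >= threshold  # skip remaining layers if confident
```

Because gradients flow only into `W` (and whatever small exit layers sit around it), tuning costs a fraction of full-parameter pre-training, which is the economy the abstract claims.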